Skip to content

Isolation Scheduler Hosts Sorting | Add secondary sort (No of free slots) and tertiary sort (Hostname)#8709

Open
meetjain74 wants to merge 1 commit into
apache:masterfrom
meetjain74:patch-2
Open

Isolation Scheduler Hosts Sorting | Add secondary sort (No of free slots) and tertiary sort (Hostname)#8709
meetjain74 wants to merge 1 commit into
apache:masterfrom
meetjain74:patch-2

Conversation

@meetjain74
Copy link
Copy Markdown
Contributor

What is the purpose of the change

All things are mentioned in the Issue #8708

How was the change tested

Deployed topologies after the changes and is working as expected.

@reiabreu
Copy link
Copy Markdown
Contributor

Hi @meetjain74 ! What do you think about adding some unit tests for this? Thank you

final Map<String, Integer> hostFreeSlotCount = new HashMap<String, Integer>();
for (WorkerSlot slot : cluster.getAvailableSlots()) {
String host = cluster.getHost(slot.getNodeId());
if (hostAssignableSlots.containsKey(host)) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we iterate over hostAssignableSlots in the outer loop? This will reduce the number of iterations, since hostAssignableSlots contains the target hosts

}
}

List<HostAssignableSlots> sortHostAssignSlots = new ArrayList<HostAssignableSlots>();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ran this through an LLM.

Suggested approach to avoid doing Map lookup in every sort iteration

 1. Add freeSlots to HostAssignableSlots:

  1     class HostAssignableSlots {
  2         private final int freeSlots; 
  3         // ... update constructor and add getter
  4     }

  2. Calculate once per host, then sort:

   1     // Build objects with pre-calculated metadata
   2     for (Map.Entry<String, List<WorkerSlot>> entry : hostAssignableSlots.entrySet()) {
   3         int free = (int) entry.getValue().stream().filter(cluster::isSlotAvailable).count();
   4         sortHostAssignSlots.add(new HostAssignableSlots(entry.getKey(), entry.getValue(), free));
   5     }
   6
   7     // Faster Comparator (Direct field access)
   8     Collections.sort(sortHostAssignSlots, (o1, o2) -> {
   9         int bySlots = o2.getWorkerSlots().size() - o1.getWorkerSlots().size();
  10         if (bySlots != 0) return bySlots;
  11         
  12         int byFree = o2.getFreeSlots() - o1.getFreeSlots();
  13         if (byFree != 0) return byFree;
  14         
  15         return o1.getHostName().compareTo(o2.getHostName());
  16     });
`

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will pick and fix this and update the PR

@rzo1 rzo1 added this to the 3.0.0 milestone May 31, 2026
Copy link
Copy Markdown
Contributor

@rzo1 rzo1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, and +1 to @reiabreu's two points — they're both on the mark. A few things to add on top.

On the comparator refactor (@reiabreu's line 334 suggestion): it's correct and worth doing — I checked the Cluster internals and getAvailablePorts() = getAssignablePorts() − usedPorts (Cluster.java:411-418), so a host's available slots are always a subset of its assignable slots. That means counting free slots among each host's own assignable list is exactly equivalent to the current "bucket getAvailableSlots() by host" approach — safe to switch, and it lifts the map lookup out of the comparator. One gotcha: the suggested snippet uses cluster::isSlotAvailable, which doesn't exist — the method is cluster.isSlotOccupied(WorkerSlot), so the predicate needs to be slot -> !cluster.isSlotOccupied(slot) (or just keep deriving the count from getAvailableSlots() and pass it into the constructor).

On unit tests (@reiabreu's earlier ask): I'd treat this as the one blocker. There's currently no focused IsolationScheduler test anywhere in the tree, so this selection path is entirely uncovered and the value of the change is unverifiable. A small test pinning both new keys would do it:

  • two hosts, equal assignable slots, different free-slot counts → assert the host with more free slots (fewer evictions) is selected first;
  • two hosts equal on both → assert the lexicographic hostname tiebreak.

The tertiary hostname sort is a nice fix in its own right — the comparator's input came from HashMap.entrySet(), so equal-ranked hosts were previously ordered non-deterministically.

LGTM on the logic; I'd just want the unit test before merge. 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants